CREPE: A Convolutional Representation for Pitch Estimation

Authors

  • Jong Wook Kim
  • Justin Salamon
  • Peter Li
  • Juan Pablo Bello
Abstract

The task of estimating the fundamental frequency of a monophonic sound recording, also known as pitch tracking, is fundamental to audio processing, with multiple applications in speech processing and music information retrieval. To date, the best-performing techniques, such as the pYIN algorithm, are based on a combination of DSP pipelines and heuristics. While such techniques perform very well on average, there remain many cases in which they fail to correctly estimate the pitch. In this paper, we propose a data-driven pitch tracking algorithm, CREPE, which is based on a deep convolutional neural network that operates directly on the time-domain waveform. We show that the proposed model produces state-of-the-art results, performing as well as or better than pYIN. Furthermore, we evaluate the model’s generalizability in terms of noise robustness. A pretrained version of CREPE is made freely available as an open-source Python module for easy application.
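
To make the task concrete, the following is a minimal sketch of fundamental-frequency estimation on a single frame of a monophonic signal. It uses plain autocorrelation with a simple peak-picking heuristic, a classical DSP baseline of the kind the abstract contrasts with; it is not CREPE's network and not the pYIN algorithm. The function names, the search range of 50 to 500 Hz, and the `peak_ratio` threshold are assumptions chosen for illustration.

```python
import math


def make_sine(freq_hz, sample_rate, n_samples):
    """Synthesize a pure tone to stand in for a monophonic recording."""
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def estimate_f0_autocorr(frame, sample_rate, fmin=50.0, fmax=500.0,
                         peak_ratio=0.9):
    """Estimate f0 (Hz) of one frame via autocorrelation, or None if unvoiced.

    Illustrative baseline only: the lag range and peak_ratio are assumptions.
    """
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(frame) - 2)

    # Length-normalized autocorrelation, computed one lag beyond each end of
    # the search range so that every candidate has two neighbours.
    scores = {}
    for lag in range(min_lag - 1, max_lag + 2):
        n = len(frame) - lag
        scores[lag] = sum(frame[i] * frame[i + lag] for i in range(n)) / n

    best = max(scores[lag] for lag in range(min_lag, max_lag + 1))
    if best <= 0.0:
        return None  # no periodicity found (e.g. silence)

    # Heuristic against octave errors: take the shortest lag that is a local
    # peak and nearly as strong as the global maximum.
    for lag in range(min_lag, max_lag + 1):
        is_peak = (scores[lag] >= scores[lag - 1]
                   and scores[lag] >= scores[lag + 1])
        if is_peak and scores[lag] >= peak_ratio * best:
            return sample_rate / lag
    return None


# Usage: a 250 Hz tone sampled at 16 kHz, analysed in one 1024-sample frame.
frame = make_sine(250.0, 16000, 1024)
f0 = estimate_f0_autocorr(frame, 16000)
print(f0)
```

Heuristics such as the peak threshold here are what make hand-built estimators brittle on hard inputs, which is the motivation the abstract gives for learning the mapping from waveform to pitch directly.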

Similar resources

BLIND PARAMETER ESTIMATION OF A RATE k/n CONVOLUTIONAL CODE IN NOISELESS CASE

This paper concerns the blind identification of a convolutional code with a desired rate in a noiseless transmission scenario. To the best of our knowledge, blind estimation of a convolutional code based only on the received bitstream does not lead to a unique solution. Hence, without loss of generality, we will assume that the transmitter employs a non-catastrophic encoder. Moreover, we consider a c...

Convolutional Pitch Target Approximation Model for Speech Synthesis

In this paper, we investigate pitch contour modelling in speech synthesis based on segmental units. A convolutional pitch target approximation model is proposed. This model allows joint stochastic modelling of the framewise pitch and the pitch contour of longer units, whose intuitive relations are revealed by a convolutional target approximation filter. The pitch contour is stylized by a linea...

Estimation of Hand Skeletal Postures by Using Deep Convolutional Neural Networks

Hand posture estimation attracts researchers because of its many applications. Hand posture recognition systems simulate hand postures using mathematical algorithms. Convolutional neural networks have provided the best results in hand posture recognition so far. In this paper, we propose a new method to estimate the hand skeletal posture using deep convolutional neural networks. T...

Multi-pitch estimation by a joint 2-d representation of pitch and pitch dynamics

Multi-pitch estimation of co-channel speech is especially challenging when the underlying pitch tracks are close in pitch value (e.g., when pitch tracks cross). Building on our previous work in [1], we demonstrate the utility of a two-dimensional (2-D) analysis method of speech for this problem by exploiting its joint representation of pitch and pitch-derivative information from distinct speake...

Crepe Complete: Multi-objective Optimization for Your Models

Search-based software engineering views software development as a process of searching through the design space for an optimal solution according to some quality criteria. It seems natural to try and build automated implementations of this idea based on concepts from model-driven engineering—using meta-models as characterisations of design spaces and model transformations as algorithms / heuris...

Journal title:
  • CoRR

Volume abs/1802.06182

Publication date: 2018